Abstract

Background: In the United Kingdom, the thermophilic Campylobacter species C. jejuni and C. coli are the most 
frequent causes of food-borne gastroenteritis in humans. While campylobacteriosis is usually a relatively mild 
infection, it has a significant public health and economic impact, and possible complications include reactive 
arthritis and the autoimmune disease Guillain-Barré syndrome. The rapid developments in "omics" technologies 
have resulted in the availability of diverse datasets allowing predictions of metabolism and physiology of 
pathogenic micro-organisms. When combined, these datasets may allow for the identification of potential 
weaknesses that can be used for development of new antimicrobials to reduce or eliminate C. jejuni and C. coli 
from the food chain. 

Results: A metabolic model of C. jejuni was constructed using the annotation of the NCTC 11168 genome 
sequence, a published model of the related bacterium Helicobacter pylori, and extensive literature mining. Using 
this model, we applied in silico Flux Balance Analysis (FBA) to determine key metabolic routes that are essential 
for generating energy and biomass, thus creating a list of genes potentially essential for growth under laboratory 
conditions. To complement this in silico approach, candidate essential genes were determined using a whole 
genome transposon mutagenesis method. FBA and transposon mutagenesis (both this study and a published 
study) predict a similar number of essential genes (around 200). The analysis of the intersection between the three 
approaches highlights the shikimate pathway, in which genes are predicted to be essential by one or more methods 
and tend to be network hubs, based on a previously published Campylobacter protein-protein interaction network, 
and could therefore be targets for novel antimicrobial therapy. 

Conclusions: We have constructed the first curated metabolic model for the food-borne pathogen Campylobacter 
jejuni and have presented the resulting metabolic insights. We have shown that the combination of in silico and in 
vivo approaches could point to non-redundant, indispensable genes associated with the well characterised 
shikimate pathway, and also genes of unknown function specific to C. jejuni, which are all potential novel 
Campylobacter intervention targets. 







Background 

The rise of antibiotic resistance in pathogenic bacteria is 
a growing concern in the developed world necessitating 
knowledge-led approaches to identify new interventions 
and prevention strategies [1]. One of the common 
sources of pathogenic bacteria is food, with the food- 
borne zoonotic pathogens Salmonella, Escherichia coli 
and Campylobacter being prime examples. Although it 
can be contended whether the use of antibiotics in the 
food industry contributes to antimicrobial resistance, it 
is clear that food-borne pathogens also increasingly 
acquire resistance to antimicrobial interventions. Multi- 
drug resistance in Salmonella is well documented [2,3]. 
For Listeria, antibiotic resistance has also been reported 
for strains isolated from food [4]. In Campylobacter, 
resistance to ampicillin, erythromycin, tetracycline, and 
ciprofloxacin has been reported [5-7]. 

In Europe, Campylobacter was the most frequent 
cause of food-borne illness in 2007, with over 200,000 
laboratory-confirmed cases [8] although the total num- 
ber of cases is thought to be approximately eightfold 
higher. Infection by Campylobacter is thought to be lar- 
gely due to the consumption of contaminated poultry 
either through poor food preparation hygiene or undercooking 
[9]. While the symptoms associated with C. 
jejuni infection (diarrhoea, vomiting, and stomach pains) 
often last only 2 to 5 days, sequelae of C. jejuni 
infection include more serious autoimmune diseases such as 
Guillain-Barré syndrome, Miller-Fisher syndrome [9], 
and reactive arthritis [10]. While human infection often 
does not require antibiotic intervention, the organism is 
endemic in poultry and farm animals, and it would be 
advantageous to have treatment options before entry in 
the food chain. 

One approach for the identification of new antibiotic 
targets for a particular bacterial pathogen is to identify 
non-redundant cellular functions or metabolic pathways 
that are indispensable for growth and/or survival of that 
organism; for example, key metabolic enzymes or cell 
wall synthesis proteins. In the post-genomic era, genome 
analysis makes both bioinformatic predictions and tar- 
geted mutagenesis strategies feasible, due to the avail- 
ability of large, curated datasets. However, genome 
annotation is often incomplete and incorrect, and meta- 
bolic redundancy (alternative pathways or catalytic activ- 
ities) can confound such rational approaches. For 
instance, a comprehensive study in Salmonella of essen- 
tial genes required during infection showed that many 
enzymes are not essential, partly due to metabolic 
redundancy [11]. An alternative (experimental, high- 
throughput) strategy is the use of random approaches 
such as transposon mutagenesis to identify essential 
genes that would be needed to infect an animal model 
or to grow and proliferate [12-14]. 

In silico, essential genes of well characterised micro- 
organisms, such as E. coli, are predicted with high accu- 
racy by Flux Balance Analysis (FBA) of the metabolic 
network [15]. FBA consists of the computation of the 
fluxes going through the metabolic reactions when the 
cells are in a homeostatic state [15]. The reactions are 
linked to the genes that encode the corresponding 
enzymes through Boolean relationships, and a gene is 
predicted to be essential if its in silico deletion results in 
negligible biomass [15]. According to Feist et al. [16], about 
90% of essential genes of E. coli can be predicted in a 
given environment. For micro-organisms other than the 
established model systems, the accuracy is lower (60- 
80%) [17,18]. Nevertheless, it provides insights into cel- 
lular metabolism, which can be useful to identify poten- 
tial new drug targets [19]. 
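
The FBA computation described above can be written as a linear programme. The notation below (stoichiometric matrix S, flux vector v, objective weights c, flux bounds l and u, threshold ε) is assumed here for illustration; it is the standard textbook form and not necessarily the exact formulation used in the cited models.

```latex
% Steady state: production and consumption of every internal metabolite balance.
\begin{aligned}
\max_{v}\quad & Z = c^{\mathsf{T}} v \\
\text{subject to}\quad & S\,v = 0, \\
& l_j \le v_j \le u_j \quad \text{for every reaction } j .
\end{aligned}

% In silico deletion of gene g: every reaction j whose Boolean
% gene-reaction rule becomes false is blocked (l_j = u_j = 0), and
% g is predicted essential when the re-optimised biomass flux satisfies
Z^{(\Delta g)} \le \varepsilon \, Z^{\mathrm{wt}}, \qquad 0 < \varepsilon \ll 1 .
```

Here c selects the biomass reaction, so Z is the predicted biomass flux; the value of ε that counts as "negligible" is a modelling choice and is not specified in this section.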

In this study we have constructed a genome scale 
metabolic model of the food-borne pathogen Campylo- 
bacter jejuni and discuss this organism's metabolism. 
Additionally, we have combined both in silico and in 
vivo approaches to make predictions about essential 
genes. A published FBA model of the closely related 
organism Helicobacter pylori [18] provided the basis for 
a Campylobacter FBA model. The essential genes predicted 
from FBA of the reconstructed genome scale 
model of C. jejuni were compared to new experimen- 
tally generated transposon mutagenesis data, and a 
recently published independent whole genome transpo- 
son mutagenesis study [20]. Whilst the overlap between 
the three methods is comparatively small, the combina- 
tion of these methods has drawn focus to the shikimate 
pathway, a known target pathway for new antimicrobial 
interventions in other bacterial pathogens [21]. 

Results 

Metabolic network reconstruction 

In a first step, we have reconstructed the metabolic net- 
work of C. jejuni based on the annotation of the NCTC 
11168 genome sequence [22,23], a published model of 
the related bacterium H. pylori [18] and extensive literature 
mining. The resulting network comprises 536 reactions 
in total, accounting for 388 genes and 467 
metabolites; it is thus of similar size to the model of 
H. pylori (see Table 1 for the number of reactions in 
this model as compared with the H. pylori model, and 
Additional file 1: model for a table with the model 
itself). In this section we will discuss specific areas of C. 
jejuni metabolism as present in, or predicted by, our 
model. 

Central metabolism 

The reactions of central metabolism have been 
mainly drawn from the literature since, in contrast to H. 
pylori, C. jejuni is predicted to contain a complete TCA 
cycle with some enzymes characteristic of anaerobes 
[24]. C. jejuni does not metabolise glucose [25] and the 
genome annotation suggests that the Embden-Meyerhof 
pathway only functions in gluconeogenesis [24], so overall 
the solution space of fluxes differs from that of H. 
pylori. The respiratory chain of C. jejuni is more complex 
than that of H. pylori [26], and C. jejuni can use 
sulphite as an electron donor [27]. 

The biosynthesis of folate results in the production of 
glycolaldehyde, which can cause cell damage by electro- 
philic attack of negatively-charged molecules [28]. In the 
H. pylori and E. coli models, glycolaldehyde is converted 
to glycolate by a glycolaldehyde dehydrogenase and 

Table 1 Number of reactions in the C. jejuni model (this 
paper) and in the H. pylori model [18]. 

Type of reaction    C. jejuni model    H. pylori model 
Metabolic           410 (367)          399 (317) 
Transport           64 (32)            77 (38) 
Sink/demands        3                  3 
Exchange            59                 74 
Total               536                553 

The number of reactions associated with genes is given in brackets. 

glycolate is degraded further through glyoxylate metabolism 
[16,18]. Alternatively, glycolate can diffuse out of 
the cell in the case of E. coli [16]. Since a glyoxylate oxidase 
(Cj1213c) has been annotated [23] and since glyoxylate 
has been shown to be degraded via a malate 
synthase in H. pylori [29], we assumed a malate synthase 
in our model despite the genome lacking the corresponding 
annotation, as in H. pylori [30,31]. 
Amino acid metabolism 

The reactions for the amino acid metabolism have been 
mainly drawn from the genome annotation. In contrast 
to H. pylori, C. jejuni seems to have the capacity to 
synthesize all the amino acids and vitamins it requires. 
This was shown experimentally with BIOLOG phenotype 
microarrays, where respiration was detected on 
growth medium containing salts and a carbon source 
only [32]. From the genome sequence, complete pathways 
for the synthesis of isoleucine, leucine and valine 
are present in C. jejuni, whereas these pathways are 
incomplete or absent in H. pylori. In addition to the 
amino acids required by H. pylori, a complete pathway 
to synthesize histidine was found in C. jejuni (cj0317, 
cjlSlSc, cj1597-99, cj1600-01, cj1603-04) while orthologs 
of these genes are absent in the H. pylori genome. The 
gene for the last step of methionine synthesis is predicted 
to be present in C. jejuni (cj1201), so no demand 
reaction was included in the model to artificially consume 
S-adenosylmethionine. In common with H. pylori, 
only one gene of the methionine salvage pathway was 
found to be present in C. jejuni (cj0117). In the iIT341 
GSM/GPR H. pylori model, the pathway was nevertheless 
included to ensure the recycling of 5-methylthioadenosine, 
a by-product of spermidine biosynthesis, to 
methionine, based on the assumption that the pathways 
vary from one micro-organism to another. However, it 
has recently been suggested that the last steps of spermidine 
biosynthesis in C. jejuni differ from the pathway 
proposed for H. pylori [33], eliminating the necessity for 
recycling 5-methylthioadenosine, so the methionine salvage 
pathway was not included in this model. 
Nucleotide metabolism 

The reactions for the nucleotide metabolism have been 
almost exclusively derived from the genome annotation. 
The pathway for the synthesis of IMP is more 
similar to that of E. coli than to that of H. pylori. 
Only a few genes have been annotated at the level of 
nucleotide inter-conversions, such as cj0293, which has 
been predicted to encode a nucleotidase [23]. However, 
most of the products of the reactions catalysed by 
this enzyme are not reutilised in the metabolic network; 
they are dead ends. So either cj0293 is incorrectly 
annotated, or genes encoding enzymes to utilize 
the products of the nucleotidase have not been 
annotated. 
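
Dead ends of this kind can be found mechanically from the reaction list. The following is a minimal sketch, assuming a toy network whose reaction and metabolite names are hypothetical (they are not entries of the model in Additional file 1): a metabolite is flagged when no reaction can produce it or no reaction can consume it.

```python
from collections import defaultdict


def find_dead_ends(reactions):
    """Return metabolites that can only be produced or only be consumed.

    `reactions` maps a reaction name to (stoichiometry, reversible), where
    stoichiometry maps metabolite -> coefficient (negative = consumed).
    """
    produced = defaultdict(set)
    consumed = defaultdict(set)
    for name, (stoich, reversible) in reactions.items():
        for met, coeff in stoich.items():
            # A reversible reaction can act as producer and consumer.
            if coeff > 0 or reversible:
                produced[met].add(name)
            if coeff < 0 or reversible:
                consumed[met].add(name)
    metabolites = set(produced) | set(consumed)
    return sorted(m for m in metabolites if not produced[m] or not consumed[m])


# Hypothetical toy network: a nucleotidase releases a nucleoside that
# no other reaction uses, so the nucleoside is a dead end.
toy_network = {
    "uptake_A": ({"A": 1}, False),
    "R1": ({"A": -1, "NMP": 1}, False),
    "nucleotidase": ({"NMP": -1, "nucleoside": 1, "Pi": 1}, False),
    "R3": ({"NMP": -1, "biomass": 1}, False),
    "Pi_export": ({"Pi": -1}, False),
    "biomass_drain": ({"biomass": -1}, False),
}

print(find_dead_ends(toy_network))
```

A flagged metabolite points either to a wrong annotation or to a missing reaction, which is the alternative raised above for cj0293.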



Vitamin and cofactor metabolism 

In the H. pylori model iIT341 GSM/GPR, it was 
assumed that pimelate diffuses into the cell and that the 
first step of the synthesis of biotin is catalysed by pime- 
lyl-CoA synthetase although no locus was found for 
such a gene [18]. Having no better alternative, we kept 
the assumption of the H. pylori model. C. jejuni does 
not contain ubiquinone, but uses menaquinone 6 and a 
methyl-substituted menaquinone [34]. It has been 
shown that for Streptomyces coelicolor, H. pylori, C. 
jejuni and Thermus thermophilus, the pathway for 
menaquinone synthesis diverges from that of E. 
coli, with futalosine as an intermediate [35]. However, the 
pathway is not completely elucidated, so in this model 
the equations of the E. coli pathway were kept, as in 
the model iIT341 GSM/GPR. As for ubiquinone, 
orthologs of the same three genes present in H. 
pylori have been predicted to be present in C. jejuni. 
Thiamine is essential for the growth of some strains of 
Campylobacter [36] but not for the strain NCTC11168 
[32]. According to the genome annotation, the pathway 
for its synthesis is complete. 
Cell wall metabolism 

There is little evidence about the composition of the cell 
wall of C. jejuni in the literature, and the genome anno- 
tation suggests that the pathways are neither those of E. 
coli nor those of H. pylori. For instance, only 2 genes 
are annotated as part of the fucose biosynthesis pathway 
in C. jejuni (cj1407c and cj1428c), while 6 such genes 
are annotated in H. pylori. However, in the absence of 
better data, the pathways for the cell wall metabolism 
were copied from the model iIT341 GSM/GPR. It has 
been shown that the fatty acid composition of C. jejuni 
changes with the environmental conditions and/or the 
growth rate [37,38], but the predominant fatty acids have 
been reported to be the saturated hexadecanoic acid, the 
unsaturated octadec-11-enoic acid and, to a lesser extent, 
cyclopropane in C19 and tetradecanoic acid, which 
increased with stress [37-40]. These are the same fatty 
acids that are included in the model iIT341 GSM/GPR. 
Other 

It is clear neither from the genome sequence nor from the 
literature how C. jejuni assimilates sulphur, and hence 
we have used the assumptions used for H. pylori [18]. 
Transport reactions were mainly drawn from circum- 
stantial evidence. Three "sink reactions" were added to 
the model because the pathways for the degradation of 
the corresponding products are unknown, and the ones 
used are the same as the ones introduced in E. coli [16] 
and H. pylori [18]. 

Exploration of the predicted metabolism of C. jejuni 

To test the model, the production of biomass from different 
carbon sources was simulated; the results are 
shown in the table in Additional file 2: substrate utilisation. 
Out of the 19 substrates tested, three are not 
included in the model (bromosuccinate, methyl pyruvate, 
α-hydroxybutyrate) and for four of them it is not 
clear from the literature whether they can be used as a 
sole carbon source. According to the model, glutamate, 
citrate, α-ketoglutarate, aspartate, asparagine, L-lactate, 
L- and D-malate, succinate, fumarate, pyruvate 
and serine can be metabolised, in agreement with the 
literature. The model also allowed the metabolism 
of proline and L-glutamine, which did not produce 
significant respiration in the BIOLOG 
experiments [32]; however, they were metabolised once 
aspartate and serine were depleted, alongside other chemicals 
in the defined media [37]. It is not clear whether 
formate can be used as an electron donor only [41] or as a 
carbon source as well [42,43], but it cannot be used as a 
sole carbon source according to the microarray experiments 
[32] and is not predicted to be metabolised in the 
model. Cysteine has been shown to be metabolised 
[37,42] but was not predicted to be sufficient as a sole 
carbon source. Finally, no regulatory constraint was considered 
in any of the simulations; the only constraints 
used were the rates of consumption of the carbon source 
and oxygen source. 

In silico prediction and experimental identification of 
essential genes in C. jejuni 

The metabolic model in combination with FBA was 
used to predict the metabolic genes that are essential 
for the production of biomass in rich medium (see 
Materials and Methods). This generated a list of 176 
predicted essential genes (see Additional file 3: predic- 
tions of essential genes). 
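
Before each simulated deletion, the Boolean gene-reaction relationships have to be evaluated to find which reactions lose their enzyme. The following is a minimal sketch of that step only. The rule encoding (nested tuples), the gene names geneA to geneE and the reaction names are hypothetical and are not taken from the model; the FBA re-optimisation that would follow is not included.

```python
def rule_active(rule, deleted):
    """Evaluate a Boolean gene-reaction rule after deleting some genes.

    A rule is either a gene name (str) or a tuple whose first element is
    "and" (enzyme complex) or "or" (isozymes), followed by sub-rules.
    """
    if isinstance(rule, str):
        return rule not in deleted
    op, *parts = rule
    results = [rule_active(part, deleted) for part in parts]
    return all(results) if op == "and" else any(results)


def disabled_reactions(gpr, gene):
    """Reactions that can no longer be catalysed when `gene` is deleted."""
    return sorted(r for r, rule in gpr.items() if not rule_active(rule, {gene}))


# Hypothetical gene-reaction rules for illustration.
gpr_example = {
    "R_complex": ("and", "geneA", "geneB"),   # both subunits required
    "R_isozyme": ("or", "geneC", "geneD"),    # either isozyme suffices
    "R_single": "geneE",
    "R_mixed": ("or", ("and", "geneA", "geneC"), "geneE"),
}

for gene in ["geneA", "geneC", "geneE"]:
    print(gene, disabled_reactions(gpr_example, gene))
```

A gene whose deletion disables no reaction (geneC here, because an isozyme remains) can never be predicted essential by FBA, which is one way metabolic redundancy enters the predictions.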

Whilst the use of FBA to predict essential genes has 
been shown to be effective [16], it only focuses on those 
genes encoding proteins in the metabolic model. To 
make a genome-wide assessment of gene essentiality, we 
used random in vitro transposition using two transposon 
inserts to generate two transposon libraries in C. jejuni 
11168 (see Materials and Methods). The Mariner and 
Tn7 transposon libraries contained 7381 and 2169 
unique inserts, respectively. Using a PCR-based genomic 
insertion site mapping strategy with validated primers 
for each annotated ORF in the C. jejuni genome 
sequence, the insertion site of each transposon was 
mapped as shown in Figure 1 with an accuracy of +/- 
50 nt. Table 2 shows a summary of the results of the 
genome-wide transposon mutagenesis experiments. In 
total, 233 genes were identified as lacking any Tn7 or 
Mariner insert, and are therefore potentially essential under 
the growth conditions used in this study. A more 
detailed list of insert locations is available as Additional 
file 3: predictions of essential genes, including the 
47 directed mutants, and a summary of the proteins 
encoded by genes without a transposon insert, classified 
according to the KEGG, BioCyc, and JCVI databases, is 
shown in Additional file 4: genes without inserts by 
pathway. Such an analysis shows the predominance of 
essential genes associated with: aromatic amino acid 
metabolism, tRNA metabolism and protein synthesis, 
energy transduction/TCA cycle, the cell envelope, and 
purine and pyrimidine metabolism. There are also 27 
genes (≈10% of the total in the annotated genome) with 
no inserts encoding hypothetical proteins of unknown 
function. Fifteen of these hypothetical proteins do not 
have homologs outside the Epsilon sub-division of the 
Proteobacteria and tend to be small proteins (median 
size, 5.6 kDa; min size, 3.6 kDa; max size, 26.4 kDa). A 
further subset of four genes is specific to the jejuni 
species of Campylobacter (cj0416, cj0747-8, and cj0974) 
and potentially encodes small polypeptides (median size, 
3.8 kDa; min size, 3.6 kDa; max size, 5.4 kDa). The 
cj0416 and cj0747-8 genes are subject to regulation as 
significant changes in transcription of these genes have 
been reported in microarray studies [44-47]. Thus far, 
no published proteomic studies in C. jejuni have identi- 
fied these proteins, although their small size may have 
hampered detection in previous studies. Analysis of 
these small proteins with respect to the published pro- 
tein-protein interaction network [48] shows that, with 
the exception of cj0344 (not present in the study), these 
proteins tend to be hubs, interacting with a large num- 
ber of other proteins. For instance, interactions between 
cj0748 and 115 other proteins (including cj0974) are 
detected in the published network. 
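
The calling rule used above (a gene with no insert in any library is a candidate essential gene) can be sketched as follows. This is an illustration under stated assumptions rather than the mapping procedure of this study, which was PCR-based: insert positions are assumed to be already available as genome coordinates, and the gene names, coordinates, insert positions and the `margin` parameter are hypothetical.

```python
def genes_without_insert(genes, libraries, margin=0):
    """Return genes that contain no transposon insert in any library.

    genes:     gene name -> (start, end), 1-based inclusive coordinates
    libraries: library name -> list of insert positions
    margin:    nucleotides trimmed from both gene ends before counting an
               insert, e.g. to allow for limited mapping accuracy
    """
    hit = set()
    for positions in libraries.values():
        for pos in positions:
            for name, (start, end) in genes.items():
                if start + margin <= pos <= end - margin:
                    hit.add(name)
    return sorted(set(genes) - hit)


# Hypothetical coordinates and insert positions.
toy_genes = {"geneA": (100, 1000), "geneB": (1200, 2000), "geneC": (2100, 2500)}
toy_libraries = {"mariner": [150, 1990], "tn7": [500, 3000]}

print(genes_without_insert(toy_genes, toy_libraries))
print(genes_without_insert(toy_genes, toy_libraries, margin=50))
print(genes_without_insert(toy_genes, {"tn7": toy_libraries["tn7"]}))
```

Calling the function on one library at a time and then on both mirrors the per-library and combined rows of Table 2: pooling libraries can only shrink the list of genes without an insert.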

The two lists of predicted essential genes, one resulting 
from FBA and the other from global in vitro transposition, 
were compared to assess both the intersection and 
the differences. A comparison of these lists shows an intersection 
of 42 genes (Table 3). Assuming that the in vitro 
transposition is not biased, the model would achieve 
60% accuracy (percentage of the total number of genes 
predicted to be essential or non-essential by both methods). 
Figure 2 shows the distribution, by metabolic pathway, of the number of 
genes predicted to be essential or dispensable by 
FBA (in silico) and as measured by the number 
of inserts (in vivo). Some of the 
genes predicted to be essential by FBA belonged to central 
metabolism, especially the gluconeogenesis pathway. 
Specifically, each step from phosphoenolpyruvate to 
glucose-6-phosphate is predicted to be essential by FBA. 
However, the transposon mutagenesis only identified 3 
genes involved in gluconeogenesis: cj1403c (glyceraldehyde 
3-phosphate dehydrogenase), cj0597 (fructose-bisphosphate 
aldolase), and cj0840c (fructose-1,6-bisphosphatase). 
Nine of the 23 genes predicted to be dispensable 
by FBA that did not contain transposons encode 
Figure 1 Circular representation of the C. jejuni NCTC11168 genome showing positions of transposon inserts. Outer 2 circles: open 
reading frames (purple) and pseudogenes (black) on the forward and reverse DNA strands. The next two circles show insert positions based on 
the Mariner library (dark red) and the Tn5 library (light red). The innermost circle shows %GC. Numbers on the outermost circle show the base 
number with respect to the start of dnaA (cj0001). Box outs show examples of the transposon mutagenesis analysis (genes are not drawn to 
scale). Genes are predicted essential if they lack any transposon insert (e.g. alaS or aroA) while genes with an insert are predicted to be 
dispensable for growth under laboratory conditions. The upper box out shows a region of the genome densely populated by transposon inserts 
and therefore this whole region is likely dispensable for growth under laboratory conditions. This region of the chromosome encodes functions 
relating to cell envelope biosynthesis and flagella modification. 



proteins associated with respiration (cj0107, cj0936, 
cj1153, cj1490c, cj1566c, cj1567c, cj1569c, cj1571c and 
cj1572c), confirming that the respiratory chain of C. 
jejuni, whilst being essential for viability, is complex and 
not fully understood [26]. Concerning the vitamins and 
cofactors, predictions by FBA are uncertain because 
only verified transport reactions were included in the 
FBA model. In addition, the range of chemicals allowed 



Table 2 Summary of the genome wide transposon 
mutagenesis for C. jejuni 

Transposon library                        Tn7      Mariner   Combined 
Number of inserts measured                2251     7654      9905 
Number of inserts in genes                2169     7380      9549 
Percentage of inserts within genes        96.35    96.42     96.40 
Number of genes with 1 or more inserts    1029     1240      1421 
Number of genes with no insert            625      414       233 




in the medium for the FBA simulation was conservative 
because the medium composition is not well defined. 
This might explain the discrepancies for the biosynthesis 
of biotin, folic acid and pantothenate where most of the 
genes associated with the pathways were predicted to be 
essential by FBA, but were not essential according to 
the transposon mutagenesis. Riboflavin was assumed to 
be present in the medium and used by the cell [49] so it 
was predicted non-essential by FBA in agreement with 
our transposon mutagenesis data. In their transposon 
study, Stahl and Stintzi [20] proposed that riboflavin 
biosynthesis is an essential pathway, perhaps 
because they used a different medium. Although ubiquinone 
has not been isolated from C. jejuni [34], the genome 
is predicted to encode three proteins that 
participate in ubiquinone biosynthesis [23]. Cj0324 
could be attributed to the synthesis of menaquinone, but 
cj0546 (UbiD) did not have any transposon inserts, 
Table 3 List of essential "metabolic genes" according to the different techniques 

Technique: FBA only 
Essential genes: cj0024, cj0026c, cj0066c, cj0075c, cj0127c, cj0146c, cj0172c, cj0187c, cj0194, cj0196c, cj0197c, cj0205, cj0227, cj0237, 
cj0240c, cj0273, cj0274, cj0286c, cj0288c, cj0296c, cj0297c, cj0298c, cj0306c, cj0307, cj0321, cj0326, cj0332c, cj0360, 
cj0384c, cj0405, cj0432c, cj0433c, cj0434, cj0435, cj0437, cj0443, cj0453, cj0490, cj0541, cj0542, cj0559, cj0576, 
cj0580c, cj0585, cj0589, cj0638c, cj0647, cj0699c, cj0716, cj0764c, cj0766c, cj0767c, cj0795c, cj0798c, cj0806, cj0813, 
cj0821, cj0822, cj0847, cj0853c, cj0858c, cj0862c, cj0891c, cj0905c, cj0910c, cj0932c, cj0947c, cj0949c, cj0955c, 
cj0992c, cj0995c, cj1039, cj1044c, cj1046c, cj1048c, cj1067, cj1080c, cj1081c, cj1088c, cj1096c, cj1104, cj1114c, 
cj1133, cj1149c, cj1150c, cj1151c, cj1152c, cj1177c, cj1183c, cj1196c, cj1198, cj1201, qUUq, cj1238, cj1243, cj1248, 
cj1364c, cj1398, cj1400c, cj1401c, cj1402c, cj1404, cj1407c, cj1424c, cj1428c, cj1476c, cj1498c, cj1515c, cj1529c, 
cj1530, cj1605c, cj1634c, cj1641, cj1645, cj1672c, cj1685c 

Technique: FBA & transposon mutagenesis of this study 
Essential genes: cj0027, cjOllS, qOlU, cj0324, cj0387, cj0394c, cj0442, cj0514, cj0516, cj0581, cj0597, cj0639c, cj0641, cj0686, 
cj0840c, cj0894c, cj0927, cj1008c, cj1045c, cj1054c, cj1058c, cj1082c, cj1131c, cj1250, cj1274c, cj1288c, cj1346c, 
cj1347c, cj1366c, cj1403c, cj1443c, cj1536c, cj1607, qWW, cj1652c 

Technique: FBA & transposon mutagenesis of Stahl and Stintzi 
Essential genes: cj0132, cj0147c, cj0308c, cj0356c, cj0451, cj0503c, cj0722c, cj0845c, cj0861c, cj0925, cj1101, qU39, cj1290c, cj1291c, 
cj1531, cj1535c 

Technique: FBA & the 2 transposon mutagenesis methods 
Essential genes: cj0231c, cj0545, cj0707, cj0810, cj0855, cj0895c, cj1644, cj1676 

The "metabolic genes" refer to the genes that were included in the FBA model. This table gives the list of the genes shown in Figure 3. The genes predicted to 
be essential by transposon mutagenesis were those with no insert in the transposon mutagenesis in this study and in the study of Stahl and Stintzi [20]. 



suggesting this gene has an unknown but important 
function. 
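
The accuracy figure quoted above is the fraction of model genes on which the two methods agree, that is, genes called essential by both plus genes called dispensable by both, divided by the number of genes in the model. A minimal sketch of this calculation follows; the gene labels g1 to g10 and g99 are hypothetical and the resulting value is not a result of this study.

```python
def agreement_accuracy(model_genes, fba_essential, tn_essential):
    """Fraction of model genes given the same call by both methods.

    Genes outside the metabolic model are ignored, because FBA makes no
    prediction for them.
    """
    model_genes = set(model_genes)
    fba = set(fba_essential) & model_genes
    tn = set(tn_essential) & model_genes
    both_essential = fba & tn
    both_dispensable = model_genes - fba - tn
    return (len(both_essential) + len(both_dispensable)) / len(model_genes)


# Hypothetical example: ten model genes, one transposon call (g99)
# lying outside the model.
toy_model_genes = {f"g{i}" for i in range(1, 11)}
toy_fba = {"g1", "g2", "g3", "g4"}
toy_tn = {"g3", "g4", "g5", "g99"}

print(agreement_accuracy(toy_model_genes, toy_fba, toy_tn))
```

Because dispensable genes usually outnumber essential ones, this measure is dominated by the genes that both methods call dispensable, so it can be fairly high even when the overlap between the two essential lists is small.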

One important caveat of the essential gene predictions 
using FBA is that some pathways were copied from H. 
pylori. In addition many genes annotated to be part of 
the cell wall were not taken into account in the model. 
However, many genes involved in fatty acid metabolism 
and all the genes involved in fatty acid elongation are 
predicted to be essential by FBA: cj0328c (FabH), cj1303 
(FabH2), cj0442 (FabF), cj0435 (FabG), cj0273 (FabZ), 
and cj1400c (FabI). FabD (cj0116) and FabF are also predicted 
essential from the transposon study. Fatty acid 
biosynthesis in Campylobacter is likely to contribute to 
the biosynthesis of the cell envelope, so it is perhaps not 
surprising that this is a key pathway. 

Comparison of the essential gene predictions with a 
published study 

The predictions made in this study were further com- 
pared to the published transposon mutagenesis study of 
Stahl and Stintzi [20]. The greatest overlap is between 
FBA and our transposon list (35 genes) as illustrated in 
Figure 3 and listed in Table 3. There are only 26 genes 
that overlap between our transposon study and the published 
study of Stahl and Stintzi [20]. Eight genes are 
common to all three lists, although they encode proteins 
from a number of disparate pathways (cj0231c (NrdF), 
cj0545 (HemC), cj0707 (KdtA), cj0810 (NadE), cj0855 
(FolD), cj0895c (AroA), cj1644 (IspA), cj1676 (MurB)). 
An analysis of the intersection genes, with respect to the 
KEGG, JCVI, and BioCyc databases, shows clusters of 
genes associated with: purine and pyrimidine metabolism, 
the cell envelope, cofactor biosynthesis (menaquinone/ubiquinone, 
folic acid, biotin, heme), and a cluster 
of five hypothetical proteins with no known function 
(cj0364, cj0703, cj0711, cj0939c, and cj1712). Following 
further analysis of these protein sequences using BLAST 
and the Pfam database, cj1712 is revealed to be a putative 
PunB/DeoD homolog (purine nucleoside phosphorylase) 
involved in purine and pyrimidine metabolism as 
shown by the amino acid sequence containing a Phos- 
phorylase superfamily domain (E = 1.3e-07) and show- 
ing between 40-80% identity to other proteins annotated 
as PunB/purine nucleoside phosphorylases in the NCBI 
microbial database. Three genes from the aromatic 
amino acid biosynthetic pathway (specifically, the shiki- 
mate pathway) are also present in the intersection list. 
Further analysis of this pathway shows that nine genes 
encoding proteins from this pathway are predicted to be 
essential by one or more of the techniques used in this 
study (Figure 4). Furthermore, analysis of these proteins 
with respect to the published protein-protein interaction 
network [48] reveals that many proteins from this path- 
way have a high number of interactions with other pro- 
teins and may therefore be classified as 'hubs' in the 
metabolic network (Figure 4). Interestingly, a number of 
these genes are also subject to transcriptional regulation 
in a virulence model [46]. Taken together, these data 
point to biosynthesis of aromatic amino acids as a 
potential metabolic weak-point in Campylobacter, which 
is also relevant during infection. 
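
The three-way comparison of essential-gene lists amounts to partitioning the sets into the regions of a Venn diagram. A minimal sketch, assuming hypothetical gene labels (g1 to g8) rather than the lists of Table 3:

```python
def overlap_regions(fba, tn_this_study, tn_published):
    """Split three essential-gene sets into regions of a three-way Venn diagram."""
    return {
        "all_three": fba & tn_this_study & tn_published,
        "fba_and_this_study_only": (fba & tn_this_study) - tn_published,
        "fba_and_published_only": (fba & tn_published) - tn_this_study,
        "both_transposon_only": (tn_this_study & tn_published) - fba,
        "fba_only": fba - tn_this_study - tn_published,
    }


# Hypothetical gene labels for illustration.
toy_fba = {"g1", "g2", "g3", "g4", "g5"}
toy_tn_this_study = {"g2", "g3", "g6", "g7"}
toy_tn_published = {"g3", "g4", "g7", "g8"}

toy_regions = overlap_regions(toy_fba, toy_tn_this_study, toy_tn_published)
for name, genes in toy_regions.items():
    print(name, sorted(genes))
```

The regions are mutually exclusive, so a gene in "all_three" is not counted again in a pairwise region; whether a quoted overlap is exclusive or inclusive of the three-way intersection should be stated when comparing counts.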

Discussion 

In this study we have combined bioinformatic 
approaches to construct and validate a genome-wide 
model of metabolism of Campylobacter jejuni, the first 
such model of this important food-borne pathogen. Flux 
Balance Analysis has been used to predict those proteins 
that, if removed from the model, result in loss of bio- 
mass production. To complement this in silico 
predictive approach, we have used random transposon 
mutagenesis coupled to gene-specific PCR to identify 
those genes that contain one or more transposon inserts 
(genes dispensable for growth under laboratory conditions) 
and those genes that do not contain a transposon 
insert, the putative essential genes. 



In silico determination of essential genes 

Although the reconstruction of the metabolic network 
of C. jejuni is based on limited biochemical data, it was 
possible to formulate a hypothesis on the metabolism of 
this pathogen. The reconstruction pointed out the main 
areas of uncertainty: the cell wall metabolism and 
nucleotide pathways. It was also found that the pathway 
for sulphur assimilation is not obvious from the genome 
annotation. 

A malate synthase activity was an assumption in our 
model, and this activity has been demonstrated in H. 
pylori [29]. Based on the annotation, the genome does 
not encode a malate synthase, and extensive BLAST 
searching using both the malate synthase A (aceB) and 
malate synthase G (glcB) sequences did not reveal any 
match in the H. pylori or C. jejuni genomes or any gen- 
ome from the epsilon sub-division of Proteobacteria. A 
new class of malate synthase enzymes may be present in 
H. pylori, and possibly in this clade of life, that does not 
have sequence homology to known characterised malate 
synthase enzymes from other bacteria, although this 
requires further biochemical evidence. The reconstruc- 
tion of the model was mainly based on conventional 
genome annotation employing BLAST searches. More 
sophisticated annotation methods have been proposed 
to address functional divergence amongst proteins that 
share sequence similarity [50,51]. For instance, we com- 
pared the EC numbers obtained with the EFICAz [52] 
and PRIAM [53] tools to the EC numbers of the reac- 
tions linked to a unique gene in our model (287 reac- 
tions) and found discrepancies for 30 and 27 reactions 
respectively. Based on the original genome annotation, 
some enzymes in our model could catalyse diverse reac- 
tions, while the more sophisticated annotation tools 



Metris et al. BMC Genomics 2011, 12:535 
http://www.biomedcentral.com/1471-2164/12/535 



[Figure 4 diagram. Metabolites shown: D-erythrose-4-phosphate; 3-deoxy-D-arabino-heptulosonate-7-phosphate; 3-dehydroquinate; 
3-dehydroshikimate; shikimate; shikimate 3-phosphate; 5-enolpyruvyl-shikimate-3-phosphate; chorismate; anthranilate; 
N-(5'-phosphoribosyl)-anthranilate; 1-(o-carboxyphenylamino)-1'-deoxyribulose-5'-phosphate; (1S,2R)-1-C-(indol-3-yl)glycerol 3-phosphate; 
indole; tryptophan; prephenate; 4-hydroxyphenylpyruvate; tyrosine; phenylpyruvate; phenylalanine. 
Enzyme labels shown: Cj0716; Cj1008c (aroB); Cj0066c (aroQ/aroD); Cj0405 (aroE); Cj0387 (aroK); Cj0895c (aroA); Cj1634c (aroC); 
Cj0345 (trpE); Cj0346 (trpD); Cj0347 (trpF); Cj0498 (trpC); Cj0349 (trpA); Cj0348 (trpB); Cj0316 (pheA); Cj0130 (tyrA); 
Cj0762c (aspB); Cj0317 (hisC); Cj1437c. 
Legend: predicted essential by FBA, by transposon mutagenesis (this study), or by Stahl et al.; numbers give the degree on the 
protein-protein interaction network (Parrish et al.); arrows give differential expression in a virulence model (Reid et al.).] 



Figure 4 The shikimate pathway. The pathway shows the biosynthesis of aromatic amino acids, via shikimate, as suggested by the genome
annotation. Each step is annotated to show the following: the enzyme ('Cj' number and gene symbol), the prediction of essentiality based on
FBA and transposon mutagenesis, the number of interactions in the PPI network, and the direction of expression from microarray data from a
piglet model of infection. Predicted essential steps are shown in bold.



suggested more metabolic specificity. An example is
Cj0324, originally annotated as a ubiquinone/menaquinone
methyltransferase (EC 2.1.1.-) [23]; the PRIAM tool [53]
specifically suggests demethylmenaquinone
methyltransferase activity (EC 2.1.1.163), which is more
likely because menaquinone and a methyl-substituted
menaquinone, rather than ubiquinone, have been isolated
from C. jejuni [34]. We also checked our metabolic model
against the 'expert community' subsystem annotation 
presented in The SEED [54], which returned discrepan- 
cies for 20 of the reactions considered above. For some 
genes, the precise annotation depends on the method 
used: returning to our hypothesis of glycolate to glyoxylate
interconversion, Cj1213c is a putative glycolate oxidase
subunit D (EC 1.1.3.15) [23,55], an alkylglycerone-phosphate
synthase (EC 2.5.1.26) [52], or a D-lactate dehydrogenase
(EC 1.1.2.4) [53], making the degradation of glycolate into
glyoxylate an uncertain assumption.
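
The comparison of EC numbers between annotation sources can be sketched as follows. The text does not state how a discrepancy was counted, so the rule used here is an assumption: a "-" field is treated as unspecified, which separates a mere refinement (Cj0324) from a genuine conflict (Cj1213c). The function name `ec_relation` is hypothetical; the EC numbers are those quoted above.

```python
def ec_relation(reference: str, candidate: str) -> str:
    """Compare two EC numbers field by field; '-' marks an unspecified field."""
    ref_fields = reference.split(".")
    cand_fields = candidate.split(".")
    if len(ref_fields) != 4 or len(cand_fields) != 4:
        raise ValueError("EC numbers must have four dot-separated fields")
    refined = False
    for ref, cand in zip(ref_fields, cand_fields):
        if ref == cand:
            continue
        if ref == "-" or cand == "-":
            refined = True  # one source is simply less specific here
            continue
        return "conflict"   # two different, fully specified fields
    return "refinement" if refined else "identical"

# Original annotation versus the PRIAM suggestion, as quoted in the text.
annotations = {
    "Cj0324": ("2.1.1.-", "2.1.1.163"),
    "Cj1213c": ("1.1.3.15", "1.1.2.4"),
}
for gene, (original, tool) in annotations.items():
    print(gene, ec_relation(original, tool))
```

Under this assumed rule, only the "conflict" cases would call the associated reaction into question, while a "refinement" narrows the predicted activity.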

An inherent limitation of the FBA method is the suitability
of the objective function [56]. For instance, by optimising
the biomass, FBA does not take into account the
microaerophilic and capnophilic properties of C. jejuni.
These properties may impose additional constraints, such
as a maximum concentration of oxygen-sensitive enzymes,
which were neglected in these simulations except through
the limiting uptake rate of oxygen. Alternatively, trade-off
functions may be a more appropriate objective function
than the optimisation of the biomass for this kind of
micro-organism [57].

The FBA method can be made condition specific when
determining essential genes. In this study, essential genes
were determined for laboratory conditions. However,
as more data become available on the conditions in the
chicken gut, the model has the potential to be used
in situations relevant to the food industry.

Transposon methods 

Flux balance analysis methods have a good track record
of predicting essential genes [16]; however, they only
cover metabolism-related genes. In this study, the
FBA model only contained reactions linked to 388
genes, 24% of the total genome. In vitro transposition
has the advantage of targeting the whole genome. We
describe the construction of two transposon mutant
libraries in C. jejuni NCTC11168 and the mapping of a
total of 9550 inserts in the genome; this represents a
coverage of 5.94x using the method of Stahl and Stintzi
[20]. Data from the combined transposon mutagenesis
libraries predicted 233 essential genes (14% of the gen- 
ome total). While FBA only uses a subset of genes from 
the genome, the number of predicted essential genes 
was similar at 175 (11% of the genome). These numbers 
are similar to the published C. jejuni study (194 genes, 
12% of the genome) although the overlap between the 
published study and the data presented in this study is 
comparatively small (only 8 genes predicted by all three 
methods). Compared to other published essential gene 
prediction studies, the number of predicted essential 
genes is between the lower quartile and median with 
respect to total number of essential genes and percen- 
tage of the genome predicted to be essential. However, 
reviewing all published microbial essential gene predic- 
tions, we noted there was no correlation between num- 
ber of predicted essential genes and genome size (data 
not shown). Clearly the relationship between genome
size, complexity of niche, and indispensable genes is
complex. In addition, a number of caveats should always be
considered when interpreting this sort of data. Any gene
containing at least one insert can be said to be
non-essential under the growth conditions described.
However, the inverse logic does not hold: the absence of an
insert in a gene does not necessarily mean that the gene
is required for growth and hence essential. Transposon
insertion may not have occurred for a variety of reasons:
chance, sequence bias of the transposase, or transposon
depletion during the reaction. Although no detailed
studies of the sequence preference of either transposase
used here have been carried out, and insertion is generally
assumed to be essentially random, we used two different
transposases in an attempt to minimise any effect
of sequence bias. An over-representation of small
hypothetical proteins (and accordingly, small genes
lacking an insert) may have resulted from the random
nature of the transposon insertion, i.e. the smaller the gene,
the smaller the chance of transposon insertion; however,
no gene size bias was observed when comparing genes
with an insert against all genes in the genome.
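
The asymmetric logic described above (an insert demonstrates dispensability, whereas the absence of an insert only nominates a candidate) and the intersection of the three approaches can be illustrated with a minimal sketch. All gene identifiers, insert counts and prediction sets below are hypothetical and are not data from this study.

```python
def classify_genes(insert_counts: dict) -> dict:
    """Label each gene from its insert counts, one count per library."""
    labels = {}
    for gene, counts in insert_counts.items():
        if any(count > 0 for count in counts):
            # At least one insert was tolerated: non-essential under
            # the growth conditions used.
            labels[gene] = "non-essential"
        else:
            # No insert observed: only a candidate, because the absence
            # may be due to chance, sequence bias or detection limits.
            labels[gene] = "candidate essential"
    return labels

# Hypothetical counts for [mariner library, Tn7 library].
counts = {"geneA": [3, 0], "geneB": [0, 0], "geneC": [0, 2], "geneD": [0, 0]}
labels = classify_genes(counts)
transposon_candidates = {g for g, lab in labels.items()
                         if lab == "candidate essential"}

# Hypothetical predictions from the other two approaches.
fba_essential = {"geneB", "geneC"}
published_essential = {"geneB", "geneD"}

# Genes supported by all three approaches.
consensus = transposon_candidates & fba_essential & published_essential
print(sorted(consensus))
```

A small consensus set, as reported above for the three methods, is therefore not surprising: each approach contributes its own false candidates, and only the intersection is supported by all of them.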

The genome of C. jejuni NCTC 11168 totals
1,641,481 bases, and thus the 9550 mapped inserts of our
libraries represent insertions in only ~0.6% of the
possible positions. It is possible that the number of
inserts identified is an under-representation of the actual
total, since inserts close together would generate very
similarly sized PCR products that may fail to be
discriminated on the agarose gels. Additionally, the
abundance of individual mutants in the isolated pooled
genomic DNA may also affect whether a band is visible.
Since the genomic DNA used was isolated from pooled
colonies, it is possible that any mutant with reduced
growth, and hence reduced colony size, would be
under-represented in the pooled material and, as a
consequence, would not be detected. The whole genome in
vitro transposition presented in this study should be seen
as a high-throughput method, as opposed to a
high-precision method. A number of caveats should not be
ignored: the library is unlikely to represent all possible
insertion points, and some regions may be naturally more
resistant to accepting an insert. Detection is constrained
by the primer library, which in this case was optimised
for microarray probe generation. The PCR and agarose
gel-based approach also suffers from more common
technical drawbacks: smaller PCR products are amplified
more favourably than longer ones, and the sizing of gel
fragments is not infallible. Additionally, polar effects
due to operon structure may result in the null recovery of
some mutants if the transcription and translation of
upstream genes is perturbed by a transposon insert, as was
shown for the C. jejuni fur gene, which in itself is not
essential [58].
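
The role of chance in leaving a gene without an insert can be made concrete with a back-of-the-envelope calculation. The sketch below assumes that every insert lands independently and uniformly along the genome, which is a simplification rather than a property established for the transposases used; the genome length and number of mapped inserts are taken from the text, while the gene lengths are illustrative.

```python
GENOME_LENGTH = 1_641_481   # bases, from the text
MAPPED_INSERTS = 9_550      # mapped inserts, from the text

def prob_no_insert(gene_length: int,
                   inserts: int = MAPPED_INSERTS,
                   genome_length: int = GENOME_LENGTH) -> float:
    """Chance that no insert lands in a gene of the given length, assuming
    independent and uniformly distributed insertions."""
    return (1.0 - gene_length / genome_length) ** inserts

for length in (150, 300, 1000, 3000):   # illustrative gene lengths in bp
    print(f"{length:>5} bp: P(no insert by chance) = "
          f"{prob_no_insert(length):.3f}")
```

Under these assumptions a 300 bp gene would remain free of inserts by chance with a probability of about 0.17, which is why short genes without an insert are weak candidates for essentiality.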

Functional vs. topological determination of essential genes

Another source of information that covers a high
percentage of the genome is the Protein-Protein Interaction
(PPI) network of C. jejuni, obtained by yeast two-hybrid
methods, which covers 80% of the proteome [48]. Each
protein is a node, and two proteins that interact are linked
by an edge. Essential genes have been linked to the
topological properties of the PPI network, as it has been
shown that essential genes are more likely to be hubs of
the PPI network than expected by chance [59,60]. We have
investigated whether there is a correlation between the
degrees of the nodes of the PPI network of C. jejuni and
essential genes determined using FBA and transposon
mutagenesis. No correlation was observed, contrary to
what was predicted by Parrish et al. [48]. They based
their analysis on putative essential proteins that were
orthologs of Escherichia coli and Bacillus subtilis
essential proteins. However, it has been shown that these
bacteria do not share many essential genes, especially B.
subtilis [20]. More recent analyses of binary PPI
networks suggest that the relationship between hubs and
essential proteins is more complex, with hubs being
correlated with genetic pleiotropy; that is, hubs are
proteins whose encoding genes produce many phenotypes
when deleted [61]. The interpretation of PPI
networks remains ambiguous, and the models proposed to
explain the universal structure of PPI networks relate it
to evolutionary principles, such as duplication
and mutation of a few ancestors [62], or to the
potential of proteins to bind together because of their
physical properties, such as binding affinity and folding
[63].
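
A comparison of node degree with essentiality of the kind described above can be sketched as follows. The statistic used in the study is not specified in the text, so the Pearson (point-biserial) correlation below is an assumption chosen for simplicity, and the toy network and essentiality calls are hypothetical.

```python
from collections import Counter
from math import sqrt

def node_degrees(edges):
    """Count interaction partners per protein from an undirected edge list."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree

def pearson(xs, ys):
    """Pearson correlation; with a 0/1 variable this is the
    point-biserial correlation."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical toy network and essentiality calls, not data from the study.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
essential = {"A"}

degree = node_degrees(edges)
proteins = sorted(degree)
degrees = [degree[p] for p in proteins]
flags = [1 if p in essential else 0 for p in proteins]
r = pearson(degrees, flags)
print(dict(degree), round(r, 3))
```

In this toy case the only essential protein is also the hub, so the correlation is strongly positive; the finding reported above corresponds to a value close to zero on the real network.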

The shikimate pathway 

Whilst the genome-wide comparison of gene essentiality 
with PPI hubs has not been fruitful in this study, the 
shikimate pathway in particular exhibits a large number 
of proteins with high degrees (see Figure 4). Since the
interpretation of physical PPI networks remains ambiguous,
the high degrees in this pathway could be interpreted
in multiple ways. Nonetheless, the combination
of essential gene prediction methods has drawn focus to 
this particular pathway as a potential target for interven- 
tion, which should be investigated further using conven- 
tional genetic tools. The shikimate pathway has been 
the subject of antimicrobial research in previous studies 
[21,64,65]. As reported by other groups, the shikimate
pathway is present in bacteria, plants, and fungi, but
absent in humans, making it a target for novel
antimicrobials and herbicides [21]. More specifically,
Zucko et al. show that a complete shikimate pathway is
present in 76% of 442 bacterial genomes studied, although
it is largely incomplete in Archaea [66]. Two E. coli
studies also identify essential genes from this pathway:
aroH and aroK were predicted essential by Gerdes et al.
[67], and aroB, aroD, aroE, aroC, and pheA plus the entire
trpABCDE operon were predicted essential by Joyce et al.
[68]. The
aroD gene is also predicted to be essential in the refined 
H. pylori metabolic model [18]. 

It is noteworthy that without any prior expectations of 
pathway targets, the methods presented in this work 
point towards a known target pathway for novel antimi- 
crobial interventions. However, the ultimate validation 
of our approach requires further laboratory investigation 
that is beyond the scope of this paper. 

Conclusions 

We have presented the first curated metabolic model of 
the important pathogen Campylobacter jejuni and dis- 
cussed insights into the organism's metabolism. Flux 
Balance Analysis used in combination with a transposon 
mutagenesis library has been used to make predictions 
about essential genes, and these predictions have been 
further informed with reference to other published stu- 
dies, such as the PPI dataset. This analysis has provided 
the basis for further laboratory investigations and sug- 
gests a re-evaluation of a previously scrutinized pathway, 
which may turn out to be the Achilles heel of this food- 
borne pathogen. 

Methods 

Reconstruction of the metabolic network for FBA 

The reconstruction of the metabolic network is based 
on the genome sequence of C. jejuni NCTC 11168 [23] 
and a recently curated and updated annotation [22]. 
Where C. jejuni had H. pylori orthologs (they share 
about 2/3 of their genome), the reactions were taken 
from the H. pylori model iIT341 GSM/GPR [18] with 
the same assumptions. The reactions were also checked 
against on-line databases [69,70] and also the literature 
on C. jejuni. In particular, the reactions for the central 
metabolism and respiration were drawn from a recent 
review [24]. The conventions for the names of chemicals
and reactions were kept as close as possible to the H.
pylori model iIT341 GSM/GPR. The reactions added to
the model were elementary and charge balanced based
on a neutral intracellular pH. Where possible, the
reactions were associated with the genes that encode the
proteins which catalyse them, using Boolean relationships:
for reactions catalysed by isozymes or by several
different proteins, the genes were combined with the
Boolean "and" and "or" operations [71].
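
The effect of such Boolean gene-reaction rules on a gene deletion can be sketched as follows. By the usual convention in constraint-based models, "or" joins alternative enzymes (isozymes) and "and" joins proteins that are all required. The function name `reaction_active` and the two example rules are hypothetical; the rules are written with locus tags for readability and are not taken from Additional file 1.

```python
import re

def reaction_active(gpr_rule: str, deleted_genes: set) -> bool:
    """Return True if the reaction can still be catalysed after the deletions.

    gpr_rule uses gene identifiers joined by 'and', 'or' and parentheses.
    """
    if not gpr_rule.strip():
        return True  # no gene association: reaction is kept (assumption)

    def substitute(match):
        token = match.group(0)
        if token in ("and", "or"):
            return token
        # A deleted gene evaluates to False, any other gene to True.
        return "False" if token in deleted_genes else "True"

    expression = re.sub(r"[A-Za-z_][A-Za-z0-9_]*", substitute, gpr_rule)
    # Only True/False, 'and', 'or' and parentheses remain at this point.
    return bool(eval(expression, {"__builtins__": {}}, {}))

# Illustrative rules only.
isozyme_rule = "Cj0762c or Cj0317"    # either enzyme is sufficient
complex_rule = "Cj0345 and Cj0346"    # both proteins are required

print(reaction_active(isozyme_rule, {"Cj0762c"}))   # reaction survives
print(reaction_active(complex_rule, {"Cj0345"}))    # reaction is lost
```

A single gene deletion therefore removes a reaction only when every alternative route encoded in its rule is broken, which is what makes isozymes a common source of predicted non-essential genes.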

Validation test for the model 

To check that the model allowed the bacterium to
metabolise the expected substrates, it was tested against
BIOLOG microplate data, in which the respiration of C. jejuni
fed on different carbon sources was measured [32], and
against other literature data [37,42,43,72]. Due to the
scarcity of data in the literature, the model could only be
tested with oxygen for respiration.
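
A validation of this kind amounts to tallying, substrate by substrate, whether the model and the experiments agree. The sketch below shows one way to do so; the substrate names and calls are hypothetical placeholders, and the actual comparison is the one given in Additional file 2.

```python
def agreement(predicted: dict, observed: dict) -> dict:
    """Tally agreement between model predictions and experiments
    for the substrates present in both dictionaries."""
    tally = {"both_positive": 0, "both_negative": 0,
             "model_only": 0, "experiment_only": 0}
    for substrate in predicted.keys() & observed.keys():
        p, o = predicted[substrate], observed[substrate]
        if p and o:
            tally["both_positive"] += 1
        elif not p and not o:
            tally["both_negative"] += 1
        elif p:
            tally["model_only"] += 1       # model predicts use, experiment does not
        else:
            tally["experiment_only"] += 1  # observed use missing from the model
    return tally

# Hypothetical calls for illustration only.
predicted = {"substrate_1": True, "substrate_2": True,
             "substrate_3": False, "substrate_4": False}
observed = {"substrate_1": True, "substrate_2": False,
            "substrate_3": False, "substrate_4": True}
print(agreement(predicted, observed))
```

Substrates counted as "experiment_only" point to reactions or transporters missing from the reconstruction, whereas "model_only" cases may reflect regulation that a stoichiometric model does not capture.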

Prediction of essential genes by FBA 

FBA consists of the computation of the possible fluxes, v,
going through the reactions of the metabolic network at
steady-state. The system of equations is defined by the
m x n stoichiometric matrix, S, containing the stoichiometric
coefficients of the metabolic reactions, with m being the
number of metabolites, and n the number of fluxes [15].
At steady-state (i.e. during balanced growth when the
biomass composition is assumed to be constant), S·v = 0.
There are more fluxes than metabolites (n > m), so the
system is underdetermined. The fluxes are bounded by
thermodynamic feasibility, so the space of solutions is a
convex space [15]. It has been shown with some organisms,
notably with E. coli [16], that the flux through the
biomass reaction is optimised for the uptake of nutrients
during balanced growth, so the biomass equation can be
used as an objective function to reduce the space of
solutions.
With less well characterised micro-organisms like C. 
jejuni, the biomass composition is not known quantita- 
tively. The biomass composition was assumed to be the 
same as for H. pylori except for (a) vitamin B6, which 
was added to the equation as the genome is predicted to 
encode the entire biosynthetic pathway bar one gene and 
(b) thiamine, for which the active form was assumed to 
be thiamine diphosphate rather than thiamine. 
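
For reference, the optimisation problem described in this section can be written in the usual linear programming form. S and v are as defined above; the objective vector c (which selects the biomass reaction) and the flux bounds v^min and v^max are notation introduced here for illustration and do not appear in the original text.

```latex
\begin{aligned}
\max_{v \in \mathbb{R}^{n}} \quad & c^{\mathsf{T}} v \\
\text{subject to} \quad & S\,v = 0, \\
& v^{\min}_{j} \le v_{j} \le v^{\max}_{j}, \qquad j = 1, \dots, n
\end{aligned}
```

Setting the entry of c for the biomass reaction to one and all other entries to zero makes the objective the biomass flux, as described above.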

Exploration of the solution space: Even after
choosing an objective function, there may be more
than one solution to the optimisation problem. These
solutions are referred to as silent phenotypes, as the
growth rate is the same but the internal organisation of
the fluxes is different [73]. Due to the design of the
algorithm, the solution returned by simplex linear
programming generally minimises the number of non-zero fluxes.
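
The steady-state condition and the existence of silent phenotypes can be illustrated on a toy network. The network below is hypothetical and is not part of the C. jejuni model; it has two parallel routes from A to B, and an uptake limit of 10 flux units is assumed so that both flux distributions shown reach the same maximal biomass flux.

```python
# Toy network (hypothetical):
#   uptake: -> A,  r1: A -> B,  r2: A -> B (parallel route),  biomass: B ->
# Rows are metabolites (A, B); columns are fluxes (uptake, r1, r2, biomass).
S = [
    [1, -1, -1, 0],   # A
    [0, 1, 1, -1],    # B
]

def mass_balance(S, v):
    """Return S·v, the net production rate of each metabolite."""
    return [sum(coef * flux for coef, flux in zip(row, v)) for row in S]

def is_steady_state(S, v, tol=1e-9):
    """True when every metabolite is produced as fast as it is consumed."""
    return all(abs(x) <= tol for x in mass_balance(S, v))

m, n = len(S), len(S[0])   # m metabolites, n fluxes; n > m, so underdetermined

# Two flux distributions with the same biomass flux (last entry).
v_a = [10, 10, 0, 10]
v_b = [10, 4, 6, 10]
for v in (v_a, v_b):
    print(v, mass_balance(S, v), is_steady_state(S, v))
```

Both vectors satisfy S·v = 0 and give the same biomass flux while routing the flux differently, which is the sense of a silent phenotype. The first vector uses fewer non-zero fluxes and is a vertex of the feasible region, the kind of solution a simplex solver returns.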

To estimate whether a gene is essential or not, the 
ratio, Gr, of the biomass flux when the gene is absent to 
the biomass flux when the gene is present was calcu- 
lated in a given environment [74]. A gene was consid- 
ered essential if Gr <= 10' (arbitrary value). 
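
The essentiality call based on Gr can be expressed in a few lines. In the sketch below the biomass fluxes are hypothetical, and the cut-off is a placeholder chosen for illustration rather than the value used in this study; in practice the two biomass fluxes would come from solving the FBA problem with and without the reactions that depend on the gene.

```python
def growth_ratio(biomass_knockout: float, biomass_wild_type: float) -> float:
    """Gr: biomass flux with the gene absent over biomass flux with it present."""
    if biomass_wild_type <= 0:
        raise ValueError("wild-type biomass flux must be positive")
    return biomass_knockout / biomass_wild_type

def is_essential(gr: float, threshold: float) -> bool:
    """A gene is called essential when Gr is at or below the chosen cut-off."""
    return gr <= threshold

# Hypothetical biomass fluxes and cut-off, for illustration only.
ILLUSTRATIVE_THRESHOLD = 1e-6
wild_type = 0.8
knockouts = {"geneX": 0.8, "geneY": 0.35, "geneZ": 0.0}
for gene, flux in knockouts.items():
    gr = growth_ratio(flux, wild_type)
    print(gene, round(gr, 3), is_essential(gr, ILLUSTRATIVE_THRESHOLD))
```

With a cut-off close to zero, a gene whose deletion merely slows growth (geneY here) is not called essential; only deletions that abolish the biomass flux are.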

Medium composition: Minimal medium was used to
validate the model. Its substrates were derived from the
BIOLOG medium experiments (http://www.biolog.com).
The experiments to screen for essential genes were
carried out in Brucella medium, which is a rich medium
containing pancreatic digest of casein, peptic digest of
animal tissues, dextrose, yeast extract, sodium chloride and
sodium bisulfite [75]. Since the exact composition of this
medium is unknown, it was assumed to be similar
to yeast extract [76]; the composition used for simulations
is given in the table in Additional file 5 (medium
composition). The chemicals are allowed to enter
or leave the system through exchange reactions. The
exchange fluxes of the carbon source were limited to a
maximum of 20 mmol/g dry weight of biomass/h, which
is close to a maximum uptake rate for E. coli [77], and that
of oxygen to 5 mmol/g dry weight of biomass/h, which is
about a quarter of the maximum fluxes measured in air
for E. coli, as C. jejuni is microaerophilic. The other
nutrients present in the medium were assumed to be
non-limiting, with an arbitrary upper bound on uptake of 1,000
mmol/g dry weight of biomass/h.
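
The uptake limits described above can be collected in a small configuration step. The three numerical limits are those stated in the text; the function name and the compound identifiers are hypothetical, and the real list of medium components is the one in Additional file 5.

```python
CARBON_MAX = 20.0      # mmol/g dry weight/h, carbon source (from the text)
OXYGEN_MAX = 5.0       # mmol/g dry weight/h, oxygen (from the text)
NON_LIMITING = 1000.0  # mmol/g dry weight/h, other nutrients (from the text)

def uptake_bounds(medium, carbon_sources, oxygen="o2"):
    """Map each medium component to its maximum uptake rate."""
    bounds = {}
    for compound in medium:
        if compound == oxygen:
            bounds[compound] = OXYGEN_MAX
        elif compound in carbon_sources:
            bounds[compound] = CARBON_MAX
        else:
            bounds[compound] = NON_LIMITING
    return bounds

# Hypothetical compound identifiers.
medium = ["carbon_source", "o2", "nutrient_1", "nutrient_2"]
bounds = uptake_bounds(medium, carbon_sources={"carbon_source"})
print(bounds)
```

Compounds absent from the medium receive no entry, which corresponds to a closed exchange reaction for uptake.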

All the calculations related to FBA were carried out 
with the COBRA Toolbox [74] in Matlab (version 
R2010b, Mathworks, Inc.) equipped with the glpk solver 
for linear programming [78]. 

Transposon mutagenesis 

Two in vitro transposition libraries were constructed
with C. jejuni NCTC11168 genomic DNA, one using mariner
transposase, essentially as described by Gaskin and van
Vliet [79], and one using Tn7 transposase from
New England Biolabs. These were introduced into C. jejuni
NCTC 11168 cells by natural transformation [79] and plated
onto Blood Agar Base no.2 (Oxoid) plates supplemented
with 5% v/v defibrinated horse blood and kanamycin (50
µg/ml). After ~48 hours incubation at 42°C under
microaerophilic conditions (5% oxygen, 10% carbon dioxide,
85% nitrogen), colonies were pooled and genomic DNA was
extracted using QIAgen genomic-tips (QIAgen).

Mapping of transposon insertions 

Genomic DNA from the pooled colonies was used as
template in polymerase chain reactions using a
transposon-specific primer and individual gene-specific primers.
Briefly, 50 µl reactions were set up using 100 ng genomic
DNA, 10-50 pmol of each primer and 25 µl of HotStarTaq
mix (QIAgen). Cycling conditions were 95°C
for 15 min, followed by 30 cycles of 95°C 30 sec, 50°C
30 sec, 72°C 90 sec and a final 72°C 15 min extension
step. Aliquots from each reaction were run on 0.8%
agarose gels, which were stained with ethidium bromide.
Gel images were captured using a GeneDoc system
(Anachem). The sizes of observed bands were calculated
using Labimage (Kapelan Bio-Imaging GmbH) and these
data were entered into Excel (Microsoft). Based on the
transposon study, 47 insertional inactivation mutants
were created using conventional methods to validate the
Tn7/Mariner findings.
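
The text does not state how band sizes were converted into insert positions, so the following is only one plausible geometry, given as a sketch: the PCR product is assumed to span the gene-specific primer, the genomic DNA up to the insertion point, and a fixed number of bases inside the transposon. All names, coordinates, the transposon offset and the gel resolution are hypothetical.

```python
def insert_position(primer_position: int, band_size: int,
                    transposon_offset: int, forward: bool = True) -> int:
    """Estimate the genome coordinate of an insert from a PCR band size.

    transposon_offset is the part of the product lying inside the transposon.
    """
    genomic_distance = band_size - transposon_offset
    if genomic_distance < 0:
        raise ValueError("band is shorter than the transposon offset")
    if forward:
        return primer_position + genomic_distance
    return primer_position - genomic_distance

def distinct_bands(band_sizes, resolution: int):
    """Merge bands that differ by less than the gel resolution, since inserts
    lying close together give products that cannot be told apart."""
    merged = []
    for size in sorted(band_sizes):
        if not merged or size - merged[-1] >= resolution:
            merged.append(size)
    return merged

# Hypothetical band sizes (bp) from one gene-specific primer.
bands = distinct_bands([400, 420, 900, 1500], resolution=50)
positions = [insert_position(10_000, size, transposon_offset=100)
             for size in bands]
print(bands, positions)
```

The merging step mirrors the caveat raised in the Discussion: two inserts 20 bp apart are counted once, so the number of mapped inserts is a lower bound on the number present in the library.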

Additional material 



Additional file 1: Tables with the reactions included in the model,
the metabolites and the references used to reconstruct the model. In
the reaction table, the EC numbers of the reactions are indicated as well
as the H. pylori orthologs as appropriate. The reactions are associated with
genes by Boolean rules. The reactions were categorised in pathways
similarly to the model of H. pylori.

Additional file 2: This table shows the substrates that were 
predicted to be metabolised according to the proposed model and 
compared to experimental data from the literature 

Additional file 3: This table shows whether genes were predicted to
be dispensable by different methods: FBA predictions, number of
inserts of the transposon mutagenesis for the two transposon
libraries of this study and whether inserts were present in the
transposon mutagenesis of an independent study [20]. The degrees
of the proteins in the PPI network and known mutants are also indicated.

Additional file 4: This table is a list of the proteins encoded by 
genes without a transposon insert as determined in this study 
(putative essential genes), classified according to the KEGG, BioCyc, 
and JCVI databases 

Additional file 5: Table of substrates allowed in the system to 
determine essential genes in silico 